Interactions between units in physical, biological, technological, and social systems
usually give rise to intricate networks with non-trivial structure, which
critically affects the dynamics and properties of the system. The focus of most 
current research on complex networks is on global network properties. A 
caveat of this approach is that the relevance of global properties hinges on 
the premise that networks are homogeneous, whereas most real-world net- 
works have a markedly modular structure. Here, we report that networks 
with different functions, including the Internet, metabolic, air transportation, 
and protein interaction networks, have distinct patterns of connections among 
nodes with different roles, and that, as a consequence, complex networks can 
be classified into two distinct functional classes based on their link type fre- 
quency. Importantly, we demonstrate that the above structural features can- 
not be captured by means of often studied global properties. 

The structure of complex networks is typically characterized in terms of global properties,
such as the average shortest path length between nodes [3], the clustering coefficient [3],
the assortativity [4] and other measures of degree-degree correlations, and, especially, the
degree distribution [7, 8]. However, these global quantities are truly informative only when one of
two strict conditions is fulfilled: (i) the network lacks a modular structure [9, 10, 11, 12, 13, 14], or (ii)
the network has a modular structure but (ii.a) all modules were formed according to the same 
mechanisms, and therefore have similar properties, and (ii.b) the interface between modules is 
statistically similar to the bulk of the modules, except for the density of links. If neither of these 
two conditions is fulfilled, then any theory proposed to explain, for example, a scale-free degree 
distribution needs to take into account the modular structure of the network. 

To our knowledge, no real-world network has been shown to fulfill either of the two conditions
above; this implies that global properties may sometimes fail to provide insight into the
mechanisms responsible for the formation or growth of these networks. Alternative approaches
that take into consideration the modular structure of real-world complex networks are therefore
necessary. One such approach is to group nodes into a small number of roles, according to
their pattern of intra- and inter-module connections [11, 12, 13]. Recently, we demonstrated that the
role of a node conveys significant information about the importance of the node, and about the
evolutionary pressures acting on it. Here, we demonstrate that modular networks can be
classified into distinct functional classes according to the patterns of role-to-role connections, 
and that the definition of link types can help us understand the function and properties of a 
particular class of networks. 

Modularity of complex networks 

We analyze four different types of real-world networks: metabolic networks [11, 15, 16], protein
interactomes [17, 18, 19, 20], global and regional air transportation networks [13, 21, 22], and the
Internet at the autonomous system (AS) level [23] (Table 1 and Supplementary Discussion). To determine
and quantify the modular structure of these networks, we use simulated annealing [24] to find
the optimal partition of the network into modules [11, 12, 25] (Methods). We then assess the
significance of the modular structure of each network by comparing it to a randomization of the same
network [25]. We find that all networks studied have a significant modular structure (Table 1).
Modules correspond to functional units in biological networks and to geo-political units in
air transportation networks [13] and, probably, in the Internet.
To assess whether global average properties are appropriate to describe the structure of
these networks, we compare global average properties of the networks to the corresponding
module-specific averages; specifically, we focus on the degree, the clustering coefficient, and
the normalized clustering coefficient. We find that the average degree of the network is not
representative of individual-module average degrees for air transportation networks (Table 2).
Most importantly, the global clustering coefficient is not representative of individual-module 
clustering coefficients for any network (except, maybe, for one out of 18 metabolic networks). 

Role-based description of complex networks 

As an alternative to the average description approach, we determine the role of each node
according to two properties [11, 12] (Methods): the relative within-module degree z, which quantifies
how well connected a node is to other nodes in its module, and the participation coefficient
P, which quantifies to what extent the node connects to different modules. We classify as
non-hubs those nodes that have low within-module degree (z < 2.5). Depending on the fraction of
connections they have to other modules, non-hubs are further subdivided into: (R1)
ultra-peripheral nodes, that is, nodes with all their links within their own module; (R2) peripheral
nodes, that is, nodes with most links within their module; (R3) satellite connectors, that is, 
nodes with a high fraction of their links to other modules; and (R4) kinless nodes, that is, nodes 
with links homogeneously distributed among all modules. We classify as hubs those nodes that 
have high within-module degree (z > 2.5). Similar to non-hubs, hubs are divided according 
to their participation coefficient into: (R5) provincial hubs, that is, hubs with the vast majority 
of links within their module; (R6) connector hubs, that is, hubs with many links to most of the 
other modules; and (R7) global hubs, that is, hubs with links homogeneously distributed among 
all modules. 

Although the full rationale for this particular definition of the roles has been given
elsewhere [12], it is important to highlight a few properties of our classification scheme. Nodes in real
and model networks, especially non-hubs, do not uniformly fill the zP-plane; our role
classification scheme arises from the fact that nodes tend to congregate into a small number of densely
populated regions of this space, with boundaries between these regions having low density of
nodes. Additionally, especially for hubs, boundaries coincide with well defined connectivity 
patterns; for example, nodes at the boundary between connector hubs (R6) and global hubs 
(R7) would have approximately half of their links in one module, and the other half perfectly 
spread in other modules. Importantly, other definitions of the roles do not alter the results we 
report below (see Supplementary Information). 

We investigate how our definition of roles relates to global network properties, and to what 
extent global network properties are representative of nodes with different roles. Since some 
simple properties like the degree and the clustering coefficient trivially depend on a node's role, 
we focus on degree-degree correlations [4, 5, 6, 19, 27, 28]. Specifically, we address two questions: (i)
whether nodes with the same degree but different roles have the same or different correlations; 
and (ii) to what extent the observed degree-degree correlations are a byproduct of the modular 
structure of the network. 

To answer these questions, we start by considering the Internet at the AS level (Fig. 1).
Nodes with degree k = 3 can be either ultra-peripheral (R1, if they have all connections in
the same module), peripheral (R2, if they have two connections in one module and one in
another), or satellite connectors (R3, if the three connections are to different modules). A
separate analysis for each role reveals that the average degree $k_{nn}(k)$ of the neighbors of a
node [5] with degree k = 3 strongly depends on the role of the node. For an instance of the 1998
Internet, for example, $k_{nn}(k = 3) = 43 \pm 8$ for ultra-peripheral nodes, $k_{nn}(k = 3) = 196 \pm 12$
for peripheral nodes, and $k_{nn}(k = 3) = 290 \pm 20$ for satellite connectors. We observe a
dependence of $k_{nn}$ on the nodes' role for all the networks studied here (Fig. 1a-d).

Regarding the second question, initial research showed [5] that for the Internet at the AS level
$k_{nn}(k) \propto k^{-0.5}$. It was later pointed out [27, 28] that any network with the same degree distribution
as the Internet should display a similar scaling. In other words, the degree distribution of the
network is responsible for most of the observed correlations. However, the degree distribution
alone does not account for all the observed correlations [28] (Fig. 1). In contrast, the modular
structure of the network does account for most of the remaining degree-degree correlations
observed in the topology of the Internet (Fig. 1). Similarly, the modular structure accounts for
the degree-degree correlations in metabolic networks and the air transportation network, and
for most of the correlations in protein interaction networks (Fig. 1).

Role-to-role connectivity profiles 

The findings we reported so far suggest that, once the degree distribution and the modular struc- 
ture are fixed, real networks have no additional internal structure. This, however, contradicts our 
intuition that networks with different growth mechanisms and functional needs should have dis- 
tinct connection patterns between nodes playing different roles. To investigate this possibility, 
we systematically analyze how nodes connect to one another depending on their roles. 

For each network, we calculate the number of links between nodes belonging to roles i
and j, and compare this number to the number of such links in a properly randomized network
(Methods). As in previous work [19, 28, 29, 30], we use the z-score to obtain a profile of over- and
under-representation of link types (Fig. 2), which enables us to compare different networks. We
quantify the overall similarity between two profiles a and b by the scalar product between these
profiles (Methods). In Fig. 2 we show that networks of the same type have highly correlated
profiles, while networks of different types have weaker correlations and, at times, even strong
anti-correlations (Fig. 2c).

The networks considered fall into two main classes, one comprising metabolic and air trans- 
portation networks, and another comprising protein interactomes and the Internet. The main 
difference between the two groups is the pattern of links between: (i) ultra-peripheral nodes 
(links of type R1-R1), and (ii) connector hubs and other hubs (links of types R5-R6 and R6-R6).
These link types are over-represented for networks in the first class (except links of type
R6-R6 in metabolic networks), and under-represented for networks in the second class. 

We denote the first class as the stringy-periphery class (Fig. 3a, b). In networks of this
class, ultra-peripheral nodes are more connected to one another than one would expect by
chance, which results in long "chains" of ultra-peripheral nodes. In metabolic networks, these
chains correspond to loop-less pathways that, for example, degrade a complex metabolite into
simpler molecules. In the air transportation network, due to the higher overall connectivity
of the network, chains contain short loops and resemble "braids." Stringy-periphery networks
also have a core of hubs, which we call the hub oligarchy, that are directly reachable from
one another (links of type R5-R6 in metabolic and air transportation networks, and R6-R6 in
air transportation networks). Moreover, connector hubs are less connected to ultra-peripheral
nodes (R1) than expected by chance alone.

We denote the second class as the multi-star class (Fig. 3c, d). The multi-star class comprises
the protein interactomes and the Internet, and has the opposite signature to the stringy-periphery
class. Links of type R1-R1 (between ultra-peripheral nodes) are under-represented, whereas
links of type R1-R5 (between ultra-peripheral nodes and provincial hubs) are over-represented,
giving rise to modules with indirectly connected "star-like" structures. Similarly, connector
hubs are less connected to one another than one would expect, which means that these networks 
depend on satellite connectors to bridge connector hubs and modules. 

Our findings confirm and clarify previous results in the literature. For example, the
under-representation of R6-R6 links in protein interactomes is consistent with previous results
suggesting a tendency for hubs to "repel" each other in these networks. Similarly, the role-to-role
connectivity profile of the Internet is consistent with the existence of a hierarchy of types of
nodes [28]. This hierarchy comprises end users, regional providers, and global providers, which
we hypothesize correspond to roles R1-R2, R5, and R6, respectively. The role-to-role
connectivity profiles are consistent with a scenario in which end users connect mostly to
regional providers, and in which global providers connect with each other indirectly through 
satellite connectors (R3), with few connections but probably large bandwidth. 

By considering the modular structure of the networks and the extra dimension introduced by 
the participation coefficient, however, our approach provides novel insights into the relationship 
between structure and function in complex networks. For example, by considering the absolute
degree alone, nodes with roles R5 and R6 in protein interactomes are indistinguishable from
each other: in S. cerevisiae, $\langle k \rangle_{R5} = 14.0 \pm 1.7$ and $\langle k \rangle_{R6} = 17.1 \pm 1.9$, whereas the average
degree for the whole network is $\langle k \rangle = 2.67 \pm 0.09$. Still, links R5-R5 between provincial hubs,
unlike R6-R6 links, are not under-represented. In general, the different connection patterns of
R5 and R6 (or R1 and R2) proteins enable us to hypothesize that they play distinct biological
roles, with R6 proteins likely being much more important.

A closer look at the air transportation network also helps to show that important structural 
properties may be left unexplained by focusing on degree alone, as well as to stress the
importance of the relative within-module degree as opposed to the degree. Johannesburg, in South
Africa, has degree k = 84, which is 23% smaller than the degree of Cincinnati in the U.S.,
k = 109. Still, one can fly from most capitals in the world to Johannesburg but not to
Cincinnati. There are two main reasons for this. First, while Johannesburg is the most connected city in
its region (sub-Saharan Africa), Cincinnati (North America) is not; this effect is captured by the
within-module relative degree, which is 9.3 for Johannesburg and 4.3 for Cincinnati. Second,
Johannesburg has many connections to other regions, whereas Cincinnati does not; this effect
is captured by the participation coefficient, which is 0.52 for Johannesburg and 0.05 for
Cincinnati. As a result, Johannesburg is a connector hub (R6) in our classification, whereas Cincinnati is a
provincial hub (R5). One can thus understand why R6-R6 connections are over-represented in
air transportation networks (most connector hubs are connected to one another), whereas R5-R5 are
not (most provincial hubs are poorly connected to provincial hubs in other regions). In general,
our approach shows why the behavior of R5 and R6 nodes is so different in air transportation 
networks, which cannot be understood from the degree of the nodes alone. 

Conclusion 

We have shown that global properties that do not take into account the modular organization of 
the network may sometimes fail to capture potentially important structural features; although all 
networks (except, maybe, the protein interactomes) show no degree-degree correlations when 
compared to the appropriate ensemble of random networks, they all have clearly distinctive 
properties in terms of how nodes with certain roles are connected to each other. Our results thus 
call attention to the need to develop new approaches that will enable us to better understand the 
structure and evolution of real-world complex networks.

Additionally, our findings demonstrate that networks with the same functional needs and 
growth mechanisms have similar patterns of connections between nodes with different roles. 
Attempts to divide complex networks into "classes" or "families" have been made before, for
example in terms of the degree distribution [8] and in terms of the relative abundance of certain
subgraphs or motifs [29, 30]. Our work here complements those attempts, and is the first one to
build on the crucial fact that most real-world networks display a markedly modular structure.

Although we cannot put forward a theory for the division of the networks into two classes, 
we hypothesize that it might be related to the fact that networks in the stringy-periphery class 
are transportation networks, in which strict conservation laws must be fulfilled. Indeed, for 
transportation systems it has been shown that, under quite general conditions, a hub oligarchy 
is the most efficient organization [32]. Conversely, both protein interactomes and the Internet
can be seen as signaling networks, which do not obey conservation laws. 

Methods



Module identification 



The modularity $\mathcal{M}(\mathcal{P})$ of a partition $\mathcal{P}$ of a network into modules is

$$\mathcal{M}(\mathcal{P}) = \sum_{s=1}^{N_M} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right], \qquad (1)$$

where $N_M$ is the number of non-empty modules (smaller than or equal to the number N of
nodes in the network), L is the number of links in the network, $l_s$ is the number of links between
nodes in module s, and $d_s$ is the sum of the degrees of the nodes in module s. The objective
of a module identification algorithm is to find the partition $\mathcal{P}^*$ that yields the largest modularity
$M = \mathcal{M}(\mathcal{P}^*)$. Note that $N_M$ is only constrained to be $N_M \le N$, but is otherwise selected by the
optimization algorithm so that $\mathcal{M}$ is maximum. The problem of identifying the optimal partition
is analogous to finding the ground state of a disordered system with Hamiltonian $H = -L\mathcal{M}$ [25].
Since the modularity landscape is in general very rugged, we use simulated annealing to
find a close to optimal partition of the network into modules [11, 12, 25]. This method is the most
accurate to date.
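
As an illustration of the modularity of a given partition, the sum over modules can be evaluated directly from an edge list. The following is a minimal sketch, assuming an undirected, unweighted network without self-loops; the function name `modularity` and the `module_of` dictionary are hypothetical names introduced here, and the simulated annealing search itself is not shown.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Modularity of a partition: sum over modules of l_s/L - (d_s/(2L))^2.

    edges: iterable of (u, v) pairs for an undirected, unweighted network.
    module_of: dict mapping each node to its module label (hypothetical input format).
    """
    edges = list(edges)
    L = len(edges)
    links_within = defaultdict(int)  # l_s: links with both ends in module s
    degree_sum = defaultdict(int)    # d_s: sum of the degrees of nodes in module s
    for u, v in edges:
        degree_sum[module_of[u]] += 1
        degree_sum[module_of[v]] += 1
        if module_of[u] == module_of[v]:
            links_within[module_of[u]] += 1
    return sum(links_within[s] / L - (degree_sum[s] / (2 * L)) ** 2
               for s in degree_sum)
```

For two triangles joined by a single link and split into their two natural modules, this gives 2 × (3/7 − 1/4) = 5/14; placing all nodes in one module gives zero.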

Role definition 

We determine the role of each node according to two properties [11, 12]: the relative within-module
degree z and the participation coefficient P. The within-module degree z-score measures how
"well-connected" node i is to other nodes in the module compared to those other nodes, and is
defined as

$$z_i = \frac{\kappa_i^{s_i} - \langle \kappa_j^{s_i} \rangle_{j \in s_i}}{\sqrt{\langle (\kappa_j^{s_i})^2 \rangle_{j \in s_i} - \langle \kappa_j^{s_i} \rangle_{j \in s_i}^2}}, \qquad (2)$$

where $\kappa_i^s$ is the number of links of node i to nodes in module s, $s_i$ is the module to which node
i belongs, and the averages $\langle \ldots \rangle_{j \in s}$ are taken over all nodes in module s.

The participation coefficient quantifies to what extent a node connects to different modules.
We define the participation coefficient $P_i$ of node i as

$$P_i = 1 - \sum_{s=1}^{N_M} \left( \frac{\kappa_i^s}{k_i} \right)^2, \qquad (3)$$

where $\kappa_i^s$ is the number of links of node i to nodes in module s, and $k_i = \sum_s \kappa_i^s$ is the total
degree of node i. The participation coefficient of a node is therefore close to one if its links are
uniformly distributed among all the modules and zero if all its links are within its own module.
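
The two node properties can be computed together from an edge list and a partition. The sketch below is illustrative only: the function name and input format are hypothetical, the standard deviation is taken over all nodes of the module, and two conventions not stated in the text are assumed (z = 0 when all nodes in a module have the same within-module degree, and P = 0 for nodes without links).

```python
from collections import defaultdict
from math import sqrt


def within_module_z_and_participation(edges, module_of):
    """Return two dicts keyed by node: within-module degree z-score and participation P."""
    # kappa[i][s]: number of links of node i to nodes in module s
    kappa = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        kappa[u][module_of[v]] += 1
        kappa[v][module_of[u]] += 1

    members = defaultdict(list)
    for node, s in module_of.items():
        members[s].append(node)

    z, P = {}, {}
    for s, nodes in members.items():
        within = [kappa[j][s] for j in nodes]
        mean = sum(within) / len(within)
        var = sum(x * x for x in within) / len(within) - mean ** 2
        std = sqrt(max(var, 0.0))
        for i in nodes:
            # Assumption: z = 0 if the module has no spread in within-module degree.
            z[i] = (kappa[i][s] - mean) / std if std > 0 else 0.0
            k_i = sum(kappa[i].values())
            # Assumption: nodes without links get P = 0.
            P[i] = 1.0 - sum((c / k_i) ** 2 for c in kappa[i].values()) if k_i else 0.0
    return z, P
```

For example, a node with two links inside its module and one link to another module has P = 1 − (4/9 + 1/9) = 4/9.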

We classify as non-hubs those nodes that have low within-module degree (z < 2.5).
Depending on the fraction of connections they have to other modules, non-hubs are further
subdivided into: (R1) ultra-peripheral nodes, that is, nodes with all their links within their own
module (P < 0.05); (R2) peripheral nodes, that is, nodes with most links within their module 
(0.05 < P < 0.62); (R3) satellite connectors, that is, nodes with a high fraction of their links 
to other modules (0.62 < P < 0.80); and (R4) kinless nodes, that is, nodes with links homo- 
geneously distributed among all modules (P > 0.80). We classify as hubs those nodes that 
have high within-module degree (z > 2.5). Similar to non-hubs, hubs are divided according to 
their participation coefficient into: (R5) provincial hubs, that is, hubs with the vast majority of 
links within their module (P < 0.30); (R6) connector hubs, that is, hubs with many links to 
most of the other modules (0.30 < P < 0.75); and (R7) global hubs, that is, hubs with links 
homogeneously distributed among all modules (P > 0.75). 
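
The thresholds above translate into a simple lookup. This is a sketch rather than the reference implementation: the text gives strict inequalities only, so the treatment of boundary values here (z = 2.5 counted as a hub, and a value of P exactly at a threshold assigned to the lower role) is an assumption, and `classify_role` is a hypothetical name.

```python
def classify_role(z, P, z_hub=2.5):
    """Map (z, P) to a role label R1..R7 using the thresholds quoted in the text.

    Assumption: boundary values (z == 2.5, or P equal to a threshold) are
    resolved as coded below; the text does not specify them.
    """
    if z < z_hub:          # non-hubs
        if P <= 0.05:
            return "R1"    # ultra-peripheral node
        if P <= 0.62:
            return "R2"    # peripheral node
        if P <= 0.80:
            return "R3"    # satellite connector
        return "R4"        # kinless node
    if P <= 0.30:          # hubs
        return "R5"        # provincial hub
    if P <= 0.75:
        return "R6"        # connector hub
    return "R7"            # global hub
```

With the values quoted in the main text, `classify_role(9.3, 0.52)` (Johannesburg) returns `"R6"` and `classify_role(4.3, 0.05)` (Cincinnati) returns `"R5"`.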

Network randomization and statistical ensembles 

We use two different ensembles of random networks [19, 28]. In the first ensemble, which we denote
by $\mathcal{D}$, we only preserve the degree sequence of the original network; in the second ensemble,
denoted $\mathcal{M}$, we preserve both the degree sequence and the modular structure of the network.
Averages over the first and second ensembles are denoted $\langle \ldots \rangle_{\mathcal{D}}$ and $\langle \ldots \rangle_{\mathcal{M}}$, respectively.

To generate random networks in ensemble $\mathcal{D}$, we randomize all the links in the network
while preserving the degree of each node. To uniformly sample all possible networks, we
use the Markov-chain Monte Carlo switching algorithm. In this algorithm, one repeatedly
selects random pairs of links, for example (i, j) and (l, m), and swaps one of the ends of each
link, so that the links become (i, m) and (l, j).

To generate random networks in ensemble $\mathcal{M}$, we restrict the Markov-chain Monte Carlo
switching algorithm [28] to pairs of links that connect nodes in the same pair of modules, that is,
we apply the Markov-chain Monte Carlo switching algorithm independently to links whose ends 
are in modules 1 and 1, 1 and 2, and so forth for all pairs of modules. This method guarantees 
that, with the same partition as the original network, the modularity of the randomized network 
is the same as that of the original network (since the number of links between each pair of 
modules is unchanged) and that the role of each node is also preserved. 
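
Both randomization procedures can be sketched with a single link-switching routine. The code below is a minimal illustration under stated assumptions: the network is simple (no self-loops or repeated links), proposals that would create a self-loop or a repeated link are rejected, and the names `switch_randomize`, `n_swaps`, and `seed` are hypothetical. When `module_of` is supplied, a switch is accepted only if both links join the same pair of modules before and after the switch.

```python
import random


def switch_randomize(edges, n_swaps, module_of=None, seed=None):
    """Degree-preserving randomization by repeated link switching.

    If module_of is given, links are switched only with links joining the same
    pair of modules, so link counts between every pair of modules are unchanged.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}

    def group(u, v):
        return frozenset((module_of[u], module_of[v]))

    for _ in range(n_swaps):
        a, b = rng.sample(range(len(edges)), 2)
        (i, j), (l, m) = edges[a], edges[b]
        if rng.random() < 0.5:  # choose which ends of the second link to exchange
            l, m = m, l
        # Proposed new links: (i, m) and (l, j).
        if i == m or l == j:
            continue  # would create a self-loop
        if frozenset((i, m)) in present or frozenset((l, j)) in present:
            continue  # would create a repeated link
        if module_of is not None:
            target = group(i, j)
            if group(l, m) != target or group(i, m) != target or group(l, j) != target:
                continue  # keep each link within its original pair of modules
        present -= {frozenset((i, j)), frozenset((l, m))}
        present |= {frozenset((i, m)), frozenset((l, j))}
        edges[a], edges[b] = (i, m), (l, j)
    return edges
```

Rejected proposals still count as steps, and the number of steps needed for adequate mixing is left to the caller.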

To investigate whether global properties are representative of module-specific properties, we
focus on degree $k_i$, clustering coefficient $C_i$, and normalized clustering coefficient $C_i / \langle C_i \rangle_{\mathcal{D}}$.
For each module s in the network, comprising $n_s$ nodes, we compute the average of each
property in the module (for example, $\langle k_i \rangle_{i \in s}$). Additionally, we compute the distribution of such
averages for random modules, which we obtain by randomly selecting groups of $n_s$ nodes. If
the empirical module average falls outside the 95% probability interval of the distribution for the
random modules, we consider that the global average is not representative of the module
average. We finally compute the fraction r of modules that are not properly described by the global
average.
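
The comparison against random modules can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: it assumes a two-sided test that uses the central 95% of the random-module averages, and the function name, `n_random`, and `seed` are hypothetical.

```python
import random


def module_differs_from_global(values, module_nodes, n_random=1000, seed=None):
    """True if the module average lies outside the central 95% of the averages
    obtained for randomly selected groups of the same size.

    values: dict mapping every node in the network to the property of interest.
    module_nodes: the nodes of one module.
    """
    rng = random.Random(seed)
    nodes = list(values)
    n_s = len(module_nodes)
    observed = sum(values[i] for i in module_nodes) / n_s
    random_means = sorted(
        sum(values[i] for i in rng.sample(nodes, n_s)) / n_s
        for _ in range(n_random)
    )
    lo = random_means[int(0.025 * n_random)]
    hi = random_means[int(0.975 * n_random) - 1]
    return observed < lo or observed > hi
```

The fraction r is then the share of modules for which this function returns `True`.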

To study degree-degree correlations, we consider the average degree $k_{nn}^i$ of the nearest
neighbors of each node i. We define the normalized nearest neighbors' degree $d^i$ as the
ratio of $k_{nn}^i$ and: (i) the average value of $k_{nn}^i$ in the network

$$d_N^i = \frac{N k_{nn}^i}{\sum_j k_{nn}^j}, \qquad (4)$$

where N is the number of nodes in the network; (ii) the expected value of $k_{nn}^i$ in the ensemble
of networks with fixed degree sequence

$$d_{\mathcal{D}}^i = \frac{k_{nn}^i}{\langle k_{nn}^i \rangle_{\mathcal{D}}}, \qquad (5)$$

and (iii) the expected value of $k_{nn}^i$ in the ensemble of networks with fixed degree sequence and
modular structure

$$d_{\mathcal{M}}^i = \frac{k_{nn}^i}{\langle k_{nn}^i \rangle_{\mathcal{M}}}. \qquad (6)$$

Note that, in spite of the similar notation, the meaning of $d_N^i$ is somewhat different from the
other two because the normalization involves an average over nodes, while in $d_{\mathcal{D}}^i$ and $d_{\mathcal{M}}^i$ the
normalization involves averages over an ensemble of randomized networks.
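
The three normalizations can be illustrated with a few helper functions. This is a sketch under stated assumptions: the function names are hypothetical, only nodes with at least one link are considered (so N is the number of such nodes), and `knn_samples` stands for the neighbor-degree dictionaries computed on networks drawn from either randomized ensemble.

```python
from collections import defaultdict


def neighbor_degree(edges):
    """Average nearest-neighbor degree of every node with at least one link."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {i: len(nb) for i, nb in neighbors.items()}
    return {i: sum(degree[j] for j in nb) / degree[i] for i, nb in neighbors.items()}


def normalize_by_network_average(knn):
    """Neighbor degree divided by its average over the nodes of the network."""
    total = sum(knn.values())
    return {i: len(knn) * value / total for i, value in knn.items()}


def normalize_by_ensemble(knn, knn_samples):
    """Neighbor degree divided by its average over a list of randomized networks."""
    return {
        i: knn[i] / (sum(sample[i] for sample in knn_samples) / len(knn_samples))
        for i in knn
    }
```

Because randomization preserves every node's degree, each node with a link in the original network also appears in every sample, so the per-node ensemble average is well defined.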

To obtain the role-to-role connectivity profiles, we calculate the z-score [19, 28, 29, 30] of the
number of links between nodes with roles i and j as

$$z_{ij} = \frac{r_{ij} - \langle r_{ij} \rangle_{\mathcal{M}}}{\sqrt{\langle r_{ij}^2 \rangle_{\mathcal{M}} - \langle r_{ij} \rangle_{\mathcal{M}}^2}}, \qquad (7)$$

where $r_{ij}$ is the number of links between nodes with roles i and j. To obtain better statistics and
an estimation of the error in the z-score, we carry out this process for several partitions of each
network.
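
The z-score of each link type can be sketched as below, given the link-type counts of the real network and of a set of randomized networks. The function names and the dictionary-based input format are hypothetical, and skipping link types whose count does not vary across the ensemble is an assumption made here to avoid dividing by zero.

```python
from collections import Counter
from math import sqrt


def link_type_counts(edges, role_of):
    """Number of links between nodes with roles i and j, as unordered role pairs."""
    return Counter(tuple(sorted((role_of[u], role_of[v]))) for u, v in edges)


def link_type_zscores(observed, randomized):
    """z-score of each link type from observed counts and a non-empty list of
    counts obtained for randomized networks."""
    types = set(observed).union(*randomized)
    z = {}
    for t in types:
        samples = [r.get(t, 0) for r in randomized]
        mean = sum(samples) / len(samples)
        var = sum(x * x for x in samples) / len(samples) - mean ** 2
        # Assumption: link types with no spread in the ensemble are skipped.
        if var > 0:
            z[t] = (observed.get(t, 0) - mean) / sqrt(var)
    return z
```

A positive z-score marks a link type as over-represented relative to the ensemble, and a negative one as under-represented.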

To evaluate the similarity between two z-score profiles a and b, we use the scalar product

$$r_{ab} = \sum_{i \le j} \frac{z_{ij}^a z_{ij}^b}{\sigma_a \sigma_b}, \qquad (8)$$

where $\sigma_a$ is the standard deviation of the elements in a.

Network type            Network              Nodes   Links   $N_M$   $M$     $\langle M \rangle_{\mathcal{D}}$ (s.d.)
Metabolism Archaea      A. fulgidus            303     366     16    0.813   0.746 (0.005)
                        A. pernix              300     387     14    0.797   0.711 (0.006)
                        M. jannaschii          223     277     14    0.813   0.720 (0.003)
                        P. aerophilum          335     421     15    0.811   0.731 (0.004)
                        P. furiosus            302     384     16    0.813   0.720 (0.007)
                        S. solfataricus        367     455     17    0.813   0.736 (0.006)
Metabolism Bacteria     B. subtilis            649     863     20    0.815   0.724 (0.003)
                        E. coli                739    1009     17    0.810   0.711 (0.003)
                        F. nucleatum           378     473     16    0.816   0.734 (0.004)
                        H. pylori              360     438     15    0.837   0.746 (0.006)
                        M. leprae              451     578     16    0.814   0.732 (0.005)
                        T. elongatus           448     546     17    0.830   0.755 (0.006)
Metabolism Eukaryotes   A. thaliana            607     792     18    0.825   0.728 (0.003)
                        C. elegans             431     569     17    0.818   0.714 (0.004)
                        H. sapiens             792    1056     23    0.842   0.727 (0.003)
                        P. falciparum          280     363     12    0.815   0.708 (0.006)
                        S. cerevisiae          570     776     17    0.814   0.708 (0.003)
                        S. pombe               503     664     18    0.827   0.721 (0.003)
Air transportation      Global                3618   14142     25    0.706   0.3111 (0.0009)
                        Asia & Middle East     706    2572     10    0.642   0.325 (0.002)
                        North America          940    3446     12    0.522   0.3111 (0.0005)
Interactome             S. cerevisiae         1458    1948     25    0.820   0.707 (0.002)
                        C. elegans            2889    5188     28    0.688   0.561 (0.002)
Internet                1998                  3216    5705     17    0.625   0.5365 (0.0011)
                        1999                  4513    8374     18    0.620   0.5227 (0.0007)
                        2000                  6474   12572     22    0.631   0.5042 (0.0008)

Table 1: Properties and modularity of the studied networks. We show the number of nodes and
links in the network, the number of modules $N_M$ and the modularity M of the best partition
obtained using simulated annealing, and the average modularity $\langle M \rangle_{\mathcal{D}}$ (and standard deviation)
of the randomizations of the network, obtained using the Markov-chain switching algorithm to
preserve the degree of each node (see Methods). Note that all networks are significantly modular,
that is, their modularity is larger than the modularity of their corresponding randomizations.

Network type            Network              $r_{\langle k_i \rangle_i}$     $r_{\langle C_i \rangle_i}$     $r_{\langle C_i / \langle C_i \rangle_{\mathcal{D}} \rangle_i}$
Metabolism Archaea      A. fulgidus          0.02 (0.03)     0.125 (0.0)     0.10 (0.03)
                        A. pernix            0.0 (0.0)       0.17 (0.04)     0.18 (0.04)
                        M. jannaschii        0.0 (0.0)       0.27 (0.03)     0.27 (0.02)
                        P. aerophilum        0.03 (0.03)     0.22 (0.06)     0.16 (0.05)
                        P. furiosus          0.02 (0.03)     0.27 (0.04)     0.24 (0.06)
                        S. solfataricus      0.02 (0.03)     0.15 (0.04)     0.11 (0.04)
Metabolism Bacteria     B. subtilis          0.02 (0.02)     0.22 (0.06)     0.19 (0.04)
                        E. coli              0.02 (0.04)     0.27 (0.06)     0.29 (0.04)
                        F. nucleatum         0.0 (0.0)       0.06 (0.02)     0.06 (0.03)
                        H. pylori            0.08 (0.05)     0.28 (0.04)     0.26 (0.03)
                        M. leprae            0.0 (0.0)       0.28 (0.05)     0.27 (0.04)
                        T. elongatus         0.01 (0.02)     0.11 (0.03)     0.12 (0.04)
Metabolism Eukaryotes   A. thaliana          0.04 (0.03)     0.29 (0.06)     0.29 (0.07)
                        C. elegans           0.0?4 (0.004)   0.31 (0.0?)     0.?0 (0.0?)
                        H. sapiens           0.08 (0.03)     0.45 (0.04)     0.41 (0.05)
                        P. falciparum        0.084 (0.002)   0.23 (0.03)     0.24 (0.02)
                        S. cerevisiae        0.09 (0.04)     0.24 (0.05)     0.23 (0.05)
                        S. pombe             0.059 (0.003)   0.37 (0.06)     0.36 (0.06)
Air transportation      Global               0.41 (0.05)     0.531 (0.010)   0.43 (0.02)
                        Asia & Middle East   0.40 (0.10)     0.26 (0.04)     0.21 (0.05)
                        North America        0.37 (0.03)     0.40 (0.04)     0.47 (0.05)
Interactome             S. cerevisiae        0.0 (0.0)       0.25 (0.09)     0.67 (0.04)
                        C. elegans           0.042 (0.014)   0.47 (0.06)     0.33 (0.04)
Internet                1998                 0.064 (0.005)   0.77 (0.05)     0.77 (0.06)
                        1999                 0.0 (0.0)       0.85 (0.03)     0.83 (0.05)
                        2000                 0.0 (0.0)       0.77 (0.04)     0.76 (0.07)

Table 2: Global versus module-specific average properties. For each network, we show the
fraction r of modules (and standard deviation) whose average degree $\langle k_i \rangle_i$, clustering
coefficient $\langle C_i \rangle_i$, and normalized clustering coefficient $\langle C_i / \langle C_i \rangle_{\mathcal{D}} \rangle_i$ significantly differ (at a 95%
confidence) from the global network average (Methods). Fractions r > 0.05 indicate that a
given global property does not correctly describe individual modules. Global degree is not
representative of individual-module degrees for air transportation networks. Most importantly, the
global clustering coefficient is not representative of individual-module clustering coefficients
for any network (except, maybe, the metabolic network of F. nucleatum).

[Figure 1. Panel titles: Metabolism, Air transportation, Protein interactome]

Figure 1: Modularity and degree distribution explain most degree-degree correlations in complex
networks. a-d, Degree of the neighbors of a node normalized by the average neighbors'
degree of all the nodes in the network; e-h, Degree $d_{\mathcal{D}}$ of the neighbors of a node normalized
by the average neighbors' degree of the node in the ensemble of random networks with fixed
degree sequence; and i-l, Neighbors' degree $d_{\mathcal{M}}$ of a node normalized by the average neighbors'
degree of the node in the ensemble of random networks with fixed degree sequence and modular
structure (Methods). Values of d are averaged over nodes with similar degree to obtain the
function d(k). Error bars represent the standard error of the average. Note that a lack of deviations
from the ensemble average, that is, d(k) = 1, indicates the absence of correlations. The results
in the middle row show that the degree distribution is responsible for some of the observed
degree-degree correlations, but cannot fully account for them. The degree distribution and the
modular structure of the network do account for most existing degree-degree correlations in the
Internet, metabolic and air transportation networks.

Figure 2: Role-to-role connectivity patterns. We plot the z-score for the abundance (Methods)
of each link type for: a, stringy-periphery networks, and b, multi-star networks (see text). Roles
are labeled as follows: (R1) ultra-peripheral; (R2) peripheral; (R3) satellite connectors; (R4)
kinless nodes; (R5) provincial hubs; (R6) connector hubs; (R7) global hubs. c, We quantify
the similarity between two z-score profiles by means of the correlation coefficient (Methods),
with yellow corresponding to large positive correlation, blue to large anti-correlation, and black
to no correlation. Gray columns in a indicate those link types that contribute the most, in
absolute value, to the correlation coefficient. These link types are, therefore, the ones that better
characterize the set of all profiles.

Figure 3: Modules and role-to-role connectivity signatures in different network types. Each
panel represents a single module (that is, all the nodes depicted belong to a single module) in: 
the metabolic network of A. thaliana, the Asia and Middle East air transportation network, the 
protein interactome of C. elegans, and the Internet in 1998. Different symbols indicate different 
node roles (see Supplementary Discussion for the names of the nodes). External links to other 
modules are not depicted, although it is possible to infer where they are from the role of the 
nodes. Shaded regions highlight important structural features. 






